CS 59000 CTT: Current Topics in Theoretical CS, Lecture 12

Authors

Elena Grigorescu

Jeff Gaither

Abstract

Our goal is to store a set S of keys so that we can quickly (i.e., with few memory accesses) answer queries of the form “Is i ∈ S?”. To do so, we encode S in a table via some encoding scheme S → E(S). An algorithm for the membership problem takes a query “Is i ∈ S?” and makes a few ‘probes’ into the table E(S), where a probe reads the cell at a given index of the table. Based on the values seen at the probed locations, the algorithm should correctly answer ‘yes’ or ‘no’. This is a fundamental problem in data structures.

Yao [3] showed that if the keys are stored explicitly (so we have n keys of log m bits each), then the best one can do is to store S in sorted order and, on a given query, perform binary search to find the answer. This means that E(S) has size n log m bits, and a search could require reading log n keys, hence log n · log m bits, which is somewhat inefficient. In the so-called ‘cell-probe model’ introduced by Yao [3], a cell contains a number of bits, and the time complexity of a scheme is the number of cells probed (as opposed to the number of bits seen). Fredman et al. [2] conceived a scheme that improves on the sorted-storage approach: the data is stored in a table of n cells (of log m bits each), but only a constant number c of probes is needed, so the algorithm sees c log m bits.

Another well-studied model is the ‘bit-probe’ model, where a cell contains only one bit. In this model the goal is to store S in a short bit string (ideally close to the information-theoretic optimum) such that any query can be answered with only a few bit probes. What is the information-theoretic minimum number of bits needed to store sets S of size ≤ n from U = [m]? Since there are $\sum_{k=1}^{n} \binom{m}{k}$ distinct nonempty sets of this kind, any encoding that represents each of them uniquely needs at least the logarithm of this count, which is Ω(n log m) bits when n is much smaller than m.
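
The counting argument can be checked numerically. The following is a minimal sketch, not part of the lecture: the function name `min_bits` is hypothetical, and it assumes the count of sets is the sum of the binomial coefficients C(m, k) for k = 1, ..., n.

```python
from math import comb


def min_bits(m: int, n: int) -> int:
    """Fewest bits that give every nonempty subset of [m] of size <= n
    its own distinct code word (hypothetical helper, for illustration)."""
    # Number of sets to distinguish: sum over k = 1..n of C(m, k).
    count = sum(comb(m, k) for k in range(1, n + 1))
    # ceil(log2(count)), computed exactly on integers.
    return (count - 1).bit_length()


if __name__ == "__main__":
    m, n = 2**20, 10
    # Compare the exact minimum with the n * log2(m) = 200 bit budget.
    print(min_bits(m, n), n * 20)
```

For n much smaller than m, the value stays within a constant factor of n log m, which is the regime the abstract has in mind.
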
The goal of a storage scheme is therefore to use as close to n log m bits as possible while still answering any query efficiently. In today’s lecture we’ll construct a scheme in which the data structure has O(n log m) bits and each query is answered correctly with high probability by making only one bit probe into the encoding.
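
As a point of comparison for the one-probe scheme, the sorted-table baseline mentioned in the abstract can be sketched as below. This is an illustrative sketch only: the names `encode_sorted` and `member` are assumptions, each list entry stands in for one cell of log m bits, and the probe counter simply counts cells read during binary search.

```python
def encode_sorted(keys):
    """E(S): store the keys explicitly, in sorted order, one key per cell."""
    return sorted(set(keys))


def member(table, i):
    """Answer "Is i in S?" by binary search.

    Returns (answer, probes), where probes is the number of cells read.
    """
    lo, hi = 0, len(table)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1          # one probe = reading one cell of the table
        cell = table[mid]
        if cell == i:
            return True, probes
        if cell < i:
            lo = mid + 1
        else:
            hi = mid
    return False, probes


if __name__ == "__main__":
    table = encode_sorted([5, 1, 9, 3, 7, 11, 13, 15])
    print(member(table, 7))   # a key that is in S
    print(member(table, 4))   # a key that is not in S
```

With n keys the loop reads at most about log n cells, so a query sees roughly log n · log m bits, in contrast to the single bit read by the scheme constructed in the lecture.
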

Publication date: 2012
